1. INTRODUCTION


In survey sampling, statisticians often study variables with highly skewed distributions,
such as income and expenditure. In such situations the estimation of the median deserves
special attention. Kuk and Mak (1989) were the first to introduce the estimation of the
population median of the study variate Y using auxiliary information in survey sampling.
Francisco and Fuller (1991) also considered the estimation of the median as part of the
estimation of a finite population distribution function. Later, Singh et al. (2001) dealt
extensively with the estimation of the median using information on an auxiliary variate in
two-phase sampling.


Consider a finite population $U = \{1, 2, \ldots, i, \ldots, N\}$. Let Y and X be the study variable and the
auxiliary variable, taking values $Y_i$ and $X_i$ respectively for the i-th unit. When the two variables
are strongly related but no information is available on the population median $M_X$ of X, we seek
to estimate the population median $M_Y$ of Y from a sample $S_m$ obtained through a two-phase
selection. Assuming a simple random sampling without replacement (SRSWOR) design in each
phase, the two-phase sampling scheme is as follows:


(i) The first phase sample $S_n$ ($S_n \subset U$) of fixed size n is drawn to observe only X in
order to furnish an estimate of $M_X$.

(ii) Given $S_n$, the second phase sample $S_m$ ($S_m \subset S_n$) of fixed size m is drawn to observe
Y only.


Assuming that the median $M_X$ of the variable X is known, Kuk and Mak (1989) suggested a ratio
estimator for the population median $M_Y$ of Y as

$$\hat M_1 = \hat M_Y \frac{M_X}{\hat M_X} \tag{1.1}$$

where $\hat M_Y$ and $\hat M_X$ are the sample estimators of $M_Y$ and $M_X$ respectively, based on a sample $S_m$
of size m. Suppose that $y_{(1)}, y_{(2)}, \ldots, y_{(m)}$ are the y values of the sample units in ascending order.
Further, let t be an integer such that $Y_{(t)} \le M_Y \le Y_{(t+1)}$ and let $p = t/m$ be the proportion of Y values
in the sample that are less than or equal to the median value $M_Y$, an unknown population
parameter. If $\hat p$ is a predictor of p, the sample median $\hat M_Y$ can be written in terms of quantiles
as $\hat Q_Y(\hat p)$, where $\hat p = 0.5$. Kuk and Mak (1989) define a matrix of proportions $(P_{ij}(x,y))$ as





              Y ≤ M_Y       Y > M_Y       Total
X ≤ M_X       P11(x,y)      P21(x,y)      P·1(x,y)
X > M_X       P12(x,y)      P22(x,y)      P·2(x,y)
Total         P1·(x,y)      P2·(x,y)      1

and a position estimator of $M_Y$ given by

$$\hat M_Y^{(p)} = \hat Q_Y(\hat p_Y) \tag{1.2}$$








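where

$$\hat p_Y = \frac{1}{m}\left[\frac{m_x\,\hat p_{11}(x,y)}{\hat p_{\cdot 1}(x,y)} + \frac{(m - m_x)\,\hat p_{12}(x,y)}{\hat p_{\cdot 2}(x,y)}\right] \approx 2\left[\frac{m_x\,\hat p_{11}(x,y) + (m - m_x)\,\hat p_{12}(x,y)}{m}\right]$$

with $\hat p_{ij}(x,y)$ being the sample analogues of the $P_{ij}(x,y)$ obtained from the population and $m_x$ the
number of units in $S_m$ with $X \le M_X$.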


Let $\tilde F_{YA}(y)$ and $\tilde F_{YB}(y)$ denote the proportions of units in the sample $S_m$ with $X \le M_X$ and
$X > M_X$, respectively, that have Y values less than or equal to y. Then, for estimating $M_Y$, Kuk and
Mak (1989) suggested the 'stratification estimator'

$$\hat M_Y^{(St)} = \inf\{y : \tilde F_Y(y) \ge 0.5\} \tag{1.3}$$

where $\tilde F_Y(y) = \frac{1}{2}\left[\tilde F_{YA}(y) + \tilde F_{YB}(y)\right]$.
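As a rough illustration of the stratification idea, the sketch below returns the smallest sample y value at which the pooled proportion reaches 0.5. It assumes that the pooled distribution function is the equally weighted average of the two within-group proportions; the function names `proportion_at_most` and `stratification_median` are hypothetical and chosen only for this example.

```python
def proportion_at_most(values, y):
    """Proportion of `values` that are less than or equal to y."""
    return sum(1 for v in values if v <= y) / len(values)


def stratification_median(xs, ys, m_x):
    """Sketch of a stratification-type median estimate.

    xs, ys : paired sample values of X and Y
    m_x    : known population median of X (assumed given)
    """
    lower = [y for x, y in zip(xs, ys) if x <= m_x]   # units with X <= M_X
    upper = [y for x, y in zip(xs, ys) if x > m_x]    # units with X >  M_X
    if not lower or not upper:
        raise ValueError("both groups must contain at least one unit")
    # The pooled proportion is a step function that only jumps at sample
    # values, so the infimum is attained at one of the observed y values.
    for y in sorted(ys):
        pooled = 0.5 * (proportion_at_most(lower, y) + proportion_at_most(upper, y))
        if pooled >= 0.5:
            return y
```

For example, `stratification_median([1, 2, 3, 4], [10, 20, 30, 40], 2.5)` scans the ordered y values and stops at the first one whose pooled proportion is at least one half.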


Note that the estimators defined in (1.1), (1.2) and (1.3) require prior knowledge
of the median $M_X$ of the auxiliary character X. In many situations of practical importance the
population median $M_X$ of X may not be known. This led Singh et al. (2001) to discuss the
problem of estimating the population median $M_Y$ in double sampling and to suggest an analogous
ratio estimator

$$\hat M_{1d} = \hat M_Y \frac{\hat M_X'}{\hat M_X} \tag{1.4}$$

where $\hat M_X'$ is the sample median based on the first phase sample $S_n$.
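The two-phase ratio idea can be sketched in a few lines. The code below is only an illustration under stated assumptions: it takes the estimator to be the second-phase sample median of Y multiplied by the ratio of the first-phase to the second-phase sample median of X, and the helper names `draw_two_phase` and `two_phase_ratio_median` are hypothetical.

```python
import random
import statistics


def draw_two_phase(population, n, m, rng):
    """SRSWOR of n units from the population, then SRSWOR of m of those n."""
    first_phase = rng.sample(population, n)
    second_phase = rng.sample(first_phase, m)
    return first_phase, second_phase


def two_phase_ratio_median(x_first_phase, x_second_phase, y_second_phase):
    """Ratio-type estimate: (median of Y in S_m) * (median of X in S_n) / (median of X in S_m)."""
    m_x_first = statistics.median(x_first_phase)     # first-phase median of X
    m_x_second = statistics.median(x_second_phase)   # second-phase median of X
    m_y_second = statistics.median(y_second_phase)   # second-phase median of Y
    return m_y_second * m_x_first / m_x_second
```

In use, `population` would be a list of unit labels (or records), X would be read off for every unit in the first-phase sample, and Y only for the units in the second-phase sample.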


Sometimes, even if $M_X$ is unknown, information on a second auxiliary variable Z, closely related
to X but, compared with X, only remotely related to Y, is available on all units of the population.
This type of situation has been briefly discussed by, among others, Chand (1975), Kiregyera (1980, 1984),
Srivenkataramana and Tracy (1989), Sahoo and Sahoo (1993) and Singh (1993). Let $M_Z$ be the
known population median of Z. Define

$$e_0 = \frac{\hat M_Y}{M_Y} - 1,\quad e_1 = \frac{\hat M_X}{M_X} - 1,\quad e_2 = \frac{\hat M_X'}{M_X} - 1,\quad e_3 = \frac{\hat M_Z}{M_Z} - 1,\quad e_4 = \frac{\hat M_Z'}{M_Z} - 1$$

such that $E(e_k) \cong 0$ and $|e_k| < 1$ for k = 0, 1, 2, 3, 4, where $\hat M_Z$ and $\hat M_Z'$ are the sample median
estimators based on the second phase sample $S_m$ and the first phase sample $S_n$ respectively. Let us
define the following two new matrices as












































              Z ≤ M_Z       Z > M_Z       Total
X ≤ M_X       P11(x,z)      P21(x,z)      P·1(x,z)
X > M_X       P12(x,z)      P22(x,z)      P·2(x,z)
Total         P1·(x,z)      P2·(x,z)      1

and

              Z ≤ M_Z       Z > M_Z       Total
Y ≤ M_Y       P11(y,z)      P21(y,z)      P·1(y,z)
Y > M_Y       P12(y,z)      P22(y,z)      P·2(y,z)
Total         P1·(y,z)      P2·(y,z)      1




















Using results given in the Appendix-1, to the first order of approximation, we have

$$E(e_2 e_4) = \left(\frac{N-n}{N}\right)(4n)^{-1}\{4P_{11}(x,z) - 1\}\{M_X M_Z f_X(M_X) f_Z(M_Z)\}^{-1},$$

$$E(e_3 e_4) = \left(\frac{N-n}{N}\right)(4n)^{-1}\{f_Z(M_Z) M_Z\}^{-2},$$

where it is assumed that as $N \to \infty$ the distribution of the trivariate variable (X, Y, Z) approaches a
continuous distribution with marginal densities $f_X(x)$, $f_Y(y)$ and $f_Z(z)$ for X, Y and Z respectively.
This assumption holds in particular under a superpopulation model framework, treating the
values of (X, Y, Z) in the population as a realization of N independent observations from a
continuous distribution. We also assume that $f_Y(M_Y)$, $f_X(M_X)$ and $f_Z(M_Z)$ are positive.


Under these conditions, the sample median $\hat M_Y$ is consistent and asymptotically normal (Gross,
1980) with mean $M_Y$ and variance

$$\left(\frac{N-m}{N}\right)(4m)^{-1}\{f_Y(M_Y)\}^{-2}.$$

In this paper we suggest a class of estimators for $M_Y$ using information on two auxiliary
variables X and Z in double sampling and analyze its properties.


2. SUGGESTED CLASS OF ESTIMATORS 


Motivated by Srivastava (1971), we suggest a class of estimators of $M_Y$ of Y as

$$\mathcal{G} = \left\{\hat M_Y^{(g)} : \hat M_Y^{(g)} = \hat M_Y\, g(u,v)\right\} \tag{2.1}$$

where $u = \hat M_X / \hat M_X'$, $v = \hat M_Z' / M_Z$ and g(u,v) is a function of u and v such that g(1,1) = 1 and such that it
satisfies the following conditions.








1. Whatever be the samples ($S_n$ and $S_m$) chosen, let (u,v) assume values in a closed convex
sub-space, P, of the two dimensional real space containing the point (1,1).

2. The function g(u,v) is continuous in P, such that g(1,1) = 1.

3. The first and second order partial derivatives of g(u,v) exist and are also continuous in P.


Expanding g(u,v) about the point (1,1) in a second order Taylor's series and taking expectations,
it is found that

$$E\left(\hat M_Y^{(g)}\right) = M_Y + O(n^{-1})$$

so the bias is of order $n^{-1}$.


Using a first order Taylor's series expansion around the point (1,1) and noting that g(1,1) = 1, we
have

$$\hat M_Y^{(g)} = M_Y\left[1 + e_0 + (e_1 - e_2)\,g_1(1,1) + e_4\,g_2(1,1) + O(n^{-1})\right]$$

or

$$\left(\hat M_Y^{(g)} - M_Y\right) \cong M_Y\left[e_0 + (e_1 - e_2)\,g_1(1,1) + e_4\,g_2(1,1)\right] \tag{2.2}$$

where $g_1(1,1)$ and $g_2(1,1)$ denote the first order partial derivatives of g(u,v) with respect to u and v
respectively at the point (1,1).
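As a worked step, the linearization behind (2.2) and the second moment it leads to can be sketched as follows. This assumes the readings $u = \hat M_X/\hat M_X'$ and $v = \hat M_Z'/M_Z$, with each $e_k$ taken as the relative error of the corresponding sample median (for example $e_1 = \hat M_X/M_X - 1$). The moments $E(\cdot)$ are left symbolic, so this is an outline of the algebra rather than the final variance formula.

```latex
\begin{align*}
u &= \frac{1+e_1}{1+e_2} = 1 + e_1 - e_2 + O(e^2), \qquad v = 1 + e_4,\\
g(u,v) &\approx 1 + (u-1)\,g_1(1,1) + (v-1)\,g_2(1,1),\\
\hat M_Y\,g(u,v) - M_Y &\approx M_Y\left[e_0 + (e_1-e_2)\,g_1(1,1) + e_4\,g_2(1,1)\right],\\
E\left(\hat M_Y\,g(u,v) - M_Y\right)^2 &\approx M_Y^2\Big[E(e_0^2) + g_1^2(1,1)\,E(e_1-e_2)^2 + g_2^2(1,1)\,E(e_4^2)\\
&\quad + 2g_1(1,1)\,E\{e_0(e_1-e_2)\} + 2g_2(1,1)\,E(e_0e_4)
      + 2g_1(1,1)\,g_2(1,1)\,E\{(e_1-e_2)e_4\}\Big].
\end{align*}
```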


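Squaring both sides of (2.2) and then taking expectations, we get the variance of $\hat M_Y^{(g)}$, to the
first degree of approximation, as

where

The variance of $\hat M_Y^{(g)}$ in (2.3) is minimized for

$$g_1(1,1) = -\frac{M_X f_X(M_X)}{M_Y f_Y(M_Y)}\,(4P_{11}(x,y) - 1), \qquad g_2(1,1) = -\frac{M_Z f_Z(M_Z)}{M_Y f_Y(M_Y)}\,(4P_{11}(y,z) - 1) \tag{2.6}$$

Thus the resulting (minimum) variance of $\hat M_Y^{(g)}$ is given by

$$\min.\mathrm{Var}\left(\hat M_Y^{(g)}\right) = \frac{1}{4(f_Y(M_Y))^2}\left[\left(\frac{1}{m} - \frac{1}{N}\right) - \left(\frac{1}{m} - \frac{1}{n}\right)(4P_{11}(x,y) - 1)^2 - \left(\frac{1}{n} - \frac{1}{N}\right)(4P_{11}(y,z) - 1)^2\right] \tag{2.7}$$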


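We have thus proved the following theorem.

Theorem 2.1 - Up to terms of order $n^{-1}$, $\mathrm{Var}(\hat M_Y^{(g)})$ is not smaller than the minimum variance
in (2.7), with equality holding if $g_1(1,1)$ and $g_2(1,1)$ take the values given in (2.6).

It is interesting to note that the lower bound of the variance of $\hat M_Y^{(g)}$ at (2.1) is the variance of
the linear regression estimator

$$\hat M_Y^{(lr)} = \hat M_Y + \hat d_1(\hat M_X' - \hat M_X) + \hat d_2(M_Z - \hat M_Z') \tag{2.8}$$

where

$$\hat d_1 = \frac{\hat f_X(\hat M_X)}{\hat f_Y(\hat M_Y)}\,(4\hat p_{11}(x,y) - 1), \qquad \hat d_2 = \frac{\hat f_Z(\hat M_Z)}{\hat f_Y(\hat M_Y)}\,(4\hat p_{11}(y,z) - 1),$$

with $\hat p_{11}(x,y)$ and $\hat p_{11}(y,z)$ being the sample analogues of $P_{11}(x,y)$ and $P_{11}(y,z)$
respectively; the density estimates $\hat f_Y(\hat M_Y)$, $\hat f_X(\hat M_X)$ and $\hat f_Z(\hat M_Z)$ can be obtained by following Silverman (1986).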


Any parametric function g(u,v) satisfying the conditions (1), (2) and (3) can generate an
asymptotically acceptable estimator. The class of such estimators is large. The following
simple functions g(u,v) give seven estimators of the class

Let the seven estimators generated by $g^{(i)}(u,v)$ be denoted by $\hat M_Y^{(gi)} = \hat M_Y\, g^{(i)}(u,v)$, (i = 1 to 7). It
is easily seen that the optimum values of the parameters $\alpha$, $\beta$, $w_i$ (i = 1, 2) are given by the right hand
sides of (2.6).
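To make the conditions on g(u,v) concrete, the sketch below defines three illustrative members of such a class and checks numerically that each equals 1 at (1,1) and has first partial derivatives equal to its two parameters there. These three forms are assumptions made for illustration (a power form, a linear form and an exponential form); they are not claimed to be the functions listed by the authors, and the names `g_power`, `g_linear`, `g_exp` and `partials_at_one` are hypothetical.

```python
import math


def g_power(u, v, a, b):
    """Power-type form: u**a * v**b."""
    return (u ** a) * (v ** b)


def g_linear(u, v, a, b):
    """Linear form: 1 + a(u - 1) + b(v - 1)."""
    return 1.0 + a * (u - 1.0) + b * (v - 1.0)


def g_exp(u, v, a, b):
    """Exponential form: exp{a(u - 1) + b(v - 1)}."""
    return math.exp(a * (u - 1.0) + b * (v - 1.0))


def partials_at_one(g, a, b, h=1e-6):
    """Central-difference estimates of the first partials of g at (1, 1)."""
    g1 = (g(1.0 + h, 1.0, a, b) - g(1.0 - h, 1.0, a, b)) / (2.0 * h)
    g2 = (g(1.0, 1.0 + h, a, b) - g(1.0, 1.0 - h, a, b)) / (2.0 * h)
    return g1, g2
```

For each of these forms the partial derivatives at (1,1) are simply the parameters a and b, so setting a and b to the optimum values of $g_1(1,1)$ and $g_2(1,1)$ gives the same first-order variance for all three.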


3. A WIDER CLASS OF ESTIMATORS 


The class of estimators (2.1) does not include the estimator

$$\hat M_{Yd} = \hat M_Y + d_1(\hat M_X' - \hat M_X) + d_2(M_Z - \hat M_Z'),$$

$(d_1, d_2)$ being constants. However, we can consider a class of estimators wider than (2.1), defined by

$$\hat M_Y^{(G)} = G(\hat M_Y, u, v) \tag{3.1}$$

of $M_Y$, where G(·) is a function of $\hat M_Y$, u and v such that $G(M_Y,1,1) = M_Y$ and $G_1(M_Y,1,1) = 1$,
$G_1(M_Y,1,1)$ denoting the first partial derivative of G(·) with respect to $\hat M_Y$.


Proceeding as in Section 2, it is easily seen that the bias of $\hat M_Y^{(G)}$ is of the order $n^{-1}$ and, up to this
order of terms, the variance of $\hat M_Y^{(G)}$ is given by
(3.2)
where $G_2(M_Y,1,1)$ and $G_3(M_Y,1,1)$ denote the first partial derivatives of G(·) with respect to u and v
respectively at the point $(M_Y,1,1)$.

The variance of $\hat M_Y^{(G)}$ is minimized for

$$G_2(M_Y,1,1) = -\frac{M_X f_X(M_X)}{f_Y(M_Y)}\,(4P_{11}(x,y) - 1), \qquad G_3(M_Y,1,1) = -\frac{M_Z f_Z(M_Z)}{f_Y(M_Y)}\,(4P_{11}(y,z) - 1) \tag{3.3}$$


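Substitution of (3.3) in (3.2) yields the minimum variance of $\hat M_Y^{(G)}$ as

$$\min.\mathrm{Var}\left(\hat M_Y^{(G)}\right) = \frac{1}{4(f_Y(M_Y))^2}\left[\left(\frac{1}{m} - \frac{1}{N}\right) - \left(\frac{1}{m} - \frac{1}{n}\right)(4P_{11}(x,y) - 1)^2 - \left(\frac{1}{n} - \frac{1}{N}\right)(4P_{11}(y,z) - 1)^2\right] = \min.\mathrm{Var}\left(\hat M_Y^{(g)}\right) \tag{3.4}$$

Thus we have established the following theorem.

Theorem 3.1 - Up to terms of order $n^{-1}$,

$$\mathrm{Var}\left(\hat M_Y^{(G)}\right) \ge \frac{1}{4(f_Y(M_Y))^2}\left[\left(\frac{1}{m} - \frac{1}{N}\right) - \left(\frac{1}{m} - \frac{1}{n}\right)(4P_{11}(x,y) - 1)^2 - \left(\frac{1}{n} - \frac{1}{N}\right)(4P_{11}(y,z) - 1)^2\right]$$

with equality holding if

$$G_2(M_Y,1,1) = -\frac{M_X f_X(M_X)}{f_Y(M_Y)}\,(4P_{11}(x,y) - 1), \qquad G_3(M_Y,1,1) = -\frac{M_Z f_Z(M_Z)}{f_Y(M_Y)}\,(4P_{11}(y,z) - 1).$$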
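If the information on the second auxiliary variable Z is not used, then the class of estimators $\hat M_Y^{(G)}$
reduces to the class of estimators of $M_Y$

$$\hat M_Y^{(H)} = H(\hat M_Y, u) \tag{3.5}$$

where $H(\hat M_Y, u)$ is a function of $(\hat M_Y, u)$ such that $H(M_Y,1) = M_Y$ and $H_1(M_Y,1) = 1$, with
$H_1(M_Y,1) = \left.\partial H(\cdot)/\partial \hat M_Y\right|_{(M_Y,1)}$. The estimator $\hat M_Y^{(H)}$ is reported by Singh et al. (2001).

The minimum variance of $\hat M_Y^{(H)}$ to the first degree of approximation is given by

$$\min.\mathrm{Var}\left(\hat M_Y^{(H)}\right) = \frac{1}{4(f_Y(M_Y))^2}\left[\left(\frac{1}{m} - \frac{1}{N}\right) - \left(\frac{1}{m} - \frac{1}{n}\right)(4P_{11}(x,y) - 1)^2\right] \tag{3.6}$$

From (3.4) and (3.6) we have

$$\min.\mathrm{Var}\left(\hat M_Y^{(H)}\right) - \min.\mathrm{Var}\left(\hat M_Y^{(G)}\right) = \left(\frac{1}{n} - \frac{1}{N}\right)\frac{(4P_{11}(y,z) - 1)^2}{4(f_Y(M_Y))^2} \tag{3.7}$$

which is always non-negative. Thus the proposed class of estimators $\hat M_Y^{(G)}$ is more efficient than the
estimator $\hat M_Y^{(H)}$ considered by Singh et al. (2001).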
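The size of the gain from the second auxiliary variable can be checked numerically. The sketch below assumes that the two minimum variances have the form $\frac{1}{4 f_Y(M_Y)^2}\left[(1/m - 1/N) - (1/m - 1/n)(4P_{11}(x,y)-1)^2\right]$ for the one-auxiliary class and the same expression less $(1/n - 1/N)(4P_{11}(y,z)-1)^2$ inside the bracket for the two-auxiliary class. The function names and the numerical inputs are hypothetical.

```python
def base_factor(f_y_at_median):
    """Common factor 1 / {4 f_Y(M_Y)^2}."""
    return 1.0 / (4.0 * f_y_at_median ** 2)


def min_var_one_auxiliary(f_y, m, n, N, p11_xy):
    """Assumed minimum variance when only X is used."""
    return base_factor(f_y) * (
        (1.0 / m - 1.0 / N) - (1.0 / m - 1.0 / n) * (4.0 * p11_xy - 1.0) ** 2
    )


def min_var_two_auxiliary(f_y, m, n, N, p11_xy, p11_yz):
    """Assumed minimum variance when both X and Z are used."""
    return min_var_one_auxiliary(f_y, m, n, N, p11_xy) - base_factor(f_y) * (
        (1.0 / n - 1.0 / N) * (4.0 * p11_yz - 1.0) ** 2
    )


# Illustrative inputs only; they are not taken from any data set.
gain = (min_var_one_auxiliary(0.4, 50, 200, 10000, 0.40)
        - min_var_two_auxiliary(0.4, 50, 200, 10000, 0.40, 0.35))
```

Under these assumptions the gain is zero only when $P_{11}(y,z) = 1/4$ or when n = N, and it grows as the first-phase sample size n shrinks relative to N.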


4. ESTIMATOR BASED ON ESTIMATED OPTIMUM VALUES 


We denote

$$\alpha_1 = \frac{M_X f_X(M_X)}{M_Y f_Y(M_Y)}\,(4P_{11}(x,y) - 1), \qquad \alpha_2 = \frac{M_Z f_Z(M_Z)}{M_Y f_Y(M_Y)}\,(4P_{11}(y,z) - 1) \tag{4.1}$$

In practice the optimum values of $g_1(1,1)$ $(= -\alpha_1)$ and $g_2(1,1)$ $(= -\alpha_2)$ are not known, so we use
their sample estimates obtained from the data at hand. Estimators of the optimum values of $g_1(1,1)$
and $g_2(1,1)$ are given as

$$\hat g_1(1,1) = -\hat\alpha_1, \qquad \hat g_2(1,1) = -\hat\alpha_2 \tag{4.2}$$

where

Now, following the procedure discussed in Singh and Singh (19xx) and Srivastava and Jhajj
(1983), we define the following class of estimators of $M_Y$ (based on estimated optimum values) as

$$\hat M_Y^{(g*)} = \hat M_Y\, g^*(u, v, \hat\alpha_1, \hat\alpha_2) \tag{4.4}$$

where g*(·) is a function of $(u, v, \hat\alpha_1, \hat\alpha_2)$ such that

$$g^*(1, 1, \alpha_1, \alpha_2) = 1$$

and such that it satisfies the following conditions:


1. Whatever be the samples ($S_n$ and $S_m$) chosen, let $(u, v, \hat\alpha_1, \hat\alpha_2)$ assume values in a closed
convex sub-space, S, of the four dimensional real space containing the point $(1, 1, \alpha_1, \alpha_2)$.

2. The function $g^*(u, v, \hat\alpha_1, \hat\alpha_2)$ is continuous in S.

3. The first and second order partial derivatives of $g^*(u, v, \hat\alpha_1, \hat\alpha_2)$ exist and are also
continuous in S.


Under the above conditions, it can be shown that

$$E\left(\hat M_Y^{(g*)}\right) = M_Y + O(n^{-1})$$

and, to the first degree of approximation, the variance of $\hat M_Y^{(g*)}$ is given by

$$\mathrm{Var}\left(\hat M_Y^{(g*)}\right) = \min.\mathrm{Var}\left(\hat M_Y^{(g)}\right)$$

where $\min.\mathrm{Var}(\hat M_Y^{(g)})$ is given in (2.7).

A wider class of estimators of $M_Y$ based on estimated optimum values is defined by

$$\hat M_Y^{(G*)} = G^*(\hat M_Y, u, v, \hat\alpha_1^*, \hat\alpha_2^*)$$

where $\hat\alpha_1^*$ and $\hat\alpha_2^*$ are the estimates of

$$\alpha_1^* = \frac{M_X f_X(M_X)}{f_Y(M_Y)}\,(4P_{11}(x,y) - 1), \qquad \alpha_2^* = \frac{M_Z f_Z(M_Z)}{f_Y(M_Y)}\,(4P_{11}(y,z) - 1)$$

and G*(·) is a function of $(\hat M_Y, u, v, \hat\alpha_1^*, \hat\alpha_2^*)$ such that

$$G^*(M_Y, 1, 1, \alpha_1^*, \alpha_2^*) = M_Y$$

Under these conditions it can easily be shown that

$$E\left(\hat M_Y^{(G*)}\right) = M_Y + O(n^{-1})$$

and, to the first degree of approximation, the variance of $\hat M_Y^{(G*)}$ is given by

$$\mathrm{Var}\left(\hat M_Y^{(G*)}\right) = \min.\mathrm{Var}\left(\hat M_Y^{(G)}\right) \tag{4.9}$$

where $\min.\mathrm{Var}(\hat M_Y^{(G)})$ is given in (3.4).

It is to be mentioned that a large number of estimators can be generated from the classes $\hat M_Y^{(g*)}$
and $\hat M_Y^{(G*)}$ based on estimated optimum values.

5. EFFICIENCY OF THE SUGGESTED CLASS OF ESTIMATORS FOR FIXED COST 


The appropriate estimator based on single-phase sampling without using any auxiliary
variable is $\hat M_Y$, whose variance is given by

$$\mathrm{Var}(\hat M_Y) = \frac{1}{4(f_Y(M_Y))^2}\left(\frac{1}{m} - \frac{1}{N}\right) \tag{5.1}$$

When we do not use any auxiliary character, the cost function is of the form $C_0 = mC_1$,
where $C_0$ and $C_1$ are the total cost and the cost per unit of collecting information on the character Y.

The optimum value of the variance for the fixed cost $C_0$ is given by

$$\mathrm{Opt.Var}(\hat M_Y) = V_0\left[\frac{C_1}{C_0} - \frac{1}{N}\right] \tag{5.2}$$

where, comparing with (5.1) at $m = C_0/C_1$, $V_0 = \{4(f_Y(M_Y))^2\}^{-1}$.

When we use one auxiliary character X then the cost function is given by

$$C_0 = C_1 m + C_2 n, \tag{5.4}$$

where $C_2$ is the cost per unit of collecting information on the auxiliary character X.

The optimum sample sizes under (5.4) for which the minimum variance of $\hat M_Y^{(H)}$ is optimum
are

where $V_1 = V_0(4P_{11}(x,y) - 1)^2$.

Putting these optimum values of m and n in the minimum variance expression of $\hat M_Y^{(H)}$ in (3.6),
we get the optimum $\min.\mathrm{Var}(\hat M_Y^{(H)})$ as

Similarly, when we use an additional character Z then the cost function is given by


$$C_0 = C_1 m + (C_2 + C_3) n \tag{5.8}$$

where $C_3$ is the cost per unit of collecting information on the character Z.

It is assumed that $C_1 > C_2 > C_3$. The optimum values of m and n for fixed cost $C_0$ which minimize
the minimum variance of $\hat M_Y^{(g)}$ (or $\hat M_Y^{(G)}$) in (2.7) (or (3.4)) are given by

The optimum variance of $\hat M_Y^{(g)}$ (or $\hat M_Y^{(G)}$) corresponding to the optimal two-phase sampling strategy
is

Assuming large N, the proposed two-phase sampling strategy would be profitable over single
phase sampling so long as

$$\mathrm{Opt.Var}(\hat M_Y) > \mathrm{Opt.}\left[\min.\mathrm{Var}(\hat M_Y^{(g)}) \text{ or } \min.\mathrm{Var}(\hat M_Y^{(G)})\right]$$

i.e.

$$\frac{C_2 + C_3}{C_1} < \left[\frac{\sqrt{V_0} - \sqrt{V_0 - V_1}}{\sqrt{V_1 - V_2}}\right]^2$$

When N is large, the proposed two-phase sampling is more efficient than the Singh et al. (2001)
strategy if


6. GENERALIZED CLASS OF ESTIMATORS


We suggest a class of estimators of $M_Y$ as

$$\Im = \left\{\hat M_Y^{(F)} : \hat M_Y^{(F)} = F(\hat M_Y, u, v, w)\right\} \tag{6.1}$$

where $u = \hat M_X/\hat M_X'$, $v = \hat M_Z'/M_Z$, $w = \hat M_Z/\hat M_Z'$ and the function F(·) assumes a value in a
bounded closed convex subset $W \subset \mathbb{R}^4$, which contains the point $(M_Y,1,1,1) = T$ and is such that
$F(T) = M_Y \Rightarrow F_1(T) = 1$, $F_1(T)$ denoting the first order partial derivative of F(·) with respect to $\hat M_Y$
at the point $T = (M_Y,1,1,1)$. Using a first order Taylor's series expansion around the point T,
we get

$$\hat M_Y^{(F)} = F(T) + (\hat M_Y - M_Y)F_1(T) + (u - 1)F_2(T) + (v - 1)F_3(T) + (w - 1)F_4(T) + O(n^{-1}) \tag{6.2}$$

where $F_2(T)$, $F_3(T)$ and $F_4(T)$ denote the first order partial derivatives of $F(\hat M_Y, u, v, w)$ with
respect to u, v and w respectively at the point T. Under the assumption that $F(T) = M_Y$ and
$F_1(T) = 1$, we have the following theorem.


Theorem 6.1. Any estimator in $\Im$ is asymptotically unbiased and normal.

Proof: Following Kuk and Mak (1989), let $P_Y$, $P_X$ and $P_Z$ denote the proportions of Y, X and Z
values respectively for which $Y \le M_Y$, $X \le M_X$ and $Z \le M_Z$; then we have

Using these expressions in (6.2), we get the required results.

Expression (6.2) can be rewritten as

Squaring both sides of (6.3) and then taking expectation, we get the variance of $\hat M_Y^{(F)}$, to the first
degree of approximation, as

The $\mathrm{Var}(\hat M_Y^{(F)})$ at (6.4) is minimized for

Thus the resulting (minimum) variance of $\hat M_Y^{(F)}$ is given by

where

and $\min.\mathrm{Var}(\hat M_Y^{(G)})$ is given in (3.4).

Expression (6.6) clearly indicates that the proposed class of estimators $\hat M_Y^{(F)}$ is more efficient
than the class of estimators $\hat M_Y^{(g)}$ (or $\hat M_Y^{(G)}$), and hence than the class of estimators $\hat M_Y^{(H)}$ suggested
by Singh et al. (2001) and the estimator $\hat M_Y$, at its optimum conditions.
The estimator based on estimated optimum values is defined by

$$\hat M_Y^{(F*)} = F^*(\hat M_Y, u, v, w, \hat a_1, \hat a_2, \hat a_3) \tag{6.8}$$

where

are the sample estimates of $a_1$, $a_2$ and $a_3$ given in (6.5) respectively, and F*(·) is a function of
$(\hat M_Y, u, v, w, \hat a_1, \hat a_2, \hat a_3)$ such that

where $T^* = (M_Y, 1, 1, 1, a_1, a_2, a_3)$.
Under these conditions it can easily be shown that

$$E\left(\hat M_Y^{(F*)}\right) = M_Y + O(n^{-1})$$

and, to the first degree of approximation, the variance of $\hat M_Y^{(F*)}$ is given by

$$\mathrm{Var}\left(\hat M_Y^{(F*)}\right) = \min.\mathrm{Var}\left(\hat M_Y^{(F)}\right) \tag{6.10}$$

where $\min.\mathrm{Var}(\hat M_Y^{(F)})$ is given in (6.6).


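Under the cost function (5.8), the optimum values of m and n which minimize the minimum
variance of $\hat M_Y^{(F)}$ are

$$n_{opt} = \frac{C_0\sqrt{(V_1 - V_2 + V_3)/(C_2 + C_3)}}{\sqrt{(V_0 - V_1 - V_3)C_1} + \sqrt{(V_1 - V_2 + V_3)(C_2 + C_3)}}, \qquad m_{opt} = \frac{C_0\sqrt{(V_0 - V_1 - V_3)/C_1}}{\sqrt{(V_0 - V_1 - V_3)C_1} + \sqrt{(V_1 - V_2 + V_3)(C_2 + C_3)}} \tag{6.11}$$

where

$$V_3 = \frac{D^2 V_0}{1 - (4P_{11}(x,z) - 1)^2} \tag{6.12}$$

For large N, the optimum value of $\min.\mathrm{Var}(\hat M_Y^{(F)})$ is given by

$$\mathrm{Opt.}\left[\min.\mathrm{Var}\left(\hat M_Y^{(F)}\right)\right] = \frac{\left[\sqrt{(V_0 - V_1 - V_3)C_1} + \sqrt{(V_1 - V_2 + V_3)(C_2 + C_3)}\right]^2}{C_0} \tag{6.13}$$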
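The allocation step can be illustrated with a generic helper. The sketch below minimizes a variance of the form a/m + b/n subject to a linear cost c_m·m + c_n·n = c0, which is the large-N structure assumed here with a = V0 − V1 − V3, b = V1 − V2 + V3, c_m = C1 and c_n = C2 + C3. The function name `optimal_allocation` is hypothetical, the returned sample sizes are real-valued (they would need rounding in practice), and the constraint m ≤ n is not enforced.

```python
import math


def optimal_allocation(a, b, c_m, c_n, c0):
    """Minimise a/m + b/n subject to c_m*m + c_n*n = c0 (a, b, costs > 0).

    Returns (m, n, optimum_variance), where optimum_variance = a/m + b/n
    evaluated at the optimum sample sizes.
    """
    denom = math.sqrt(a * c_m) + math.sqrt(b * c_n)
    m = c0 * math.sqrt(a / c_m) / denom
    n = c0 * math.sqrt(b / c_n) / denom
    optimum_variance = denom ** 2 / c0
    return m, n, optimum_variance


# Illustrative numbers only.
m_opt, n_opt, v_opt = optimal_allocation(a=4.0, b=1.0, c_m=1.0, c_n=1.0, c0=6.0)
```

The closed form follows from the Cauchy-Schwarz inequality (or a Lagrange multiplier), which gives m proportional to sqrt(a/c_m) and n proportional to sqrt(b/c_n).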


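The proposed two-phase sampling strategy would be profitable over single-phase sampling so
long as $\mathrm{Opt.}[\mathrm{Var}(\hat M_Y)] > \mathrm{Opt.}[\min.\mathrm{Var}(\hat M_Y^{(F)})]$, i.e.

$$\frac{C_2 + C_3}{C_1} < \left[\frac{\sqrt{V_0} - \sqrt{V_0 - V_1 - V_3}}{\sqrt{V_1 - V_2 + V_3}}\right]^2 \tag{6.14}$$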
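The profitability condition can be cross-checked against a direct comparison of the two optimum variances. The sketch below assumes large N, a single-phase optimum of V0·C1/C0 and a two-phase optimum of [sqrt(a·C1) + sqrt(b·(C2+C3))]²/C0, with a = V0 − V1 − V3 and b = V1 − V2 + V3. All function names and the numerical values are hypothetical.

```python
import math


def single_phase_opt_var(v0, c1, c0):
    """Large-N optimum variance of the sample median with cost c0 = m*c1."""
    return v0 * c1 / c0


def two_phase_opt_var(a, b, c1, c_first, c0):
    """Large-N optimum two-phase variance; c_first is the first-phase unit cost."""
    return (math.sqrt(a * c1) + math.sqrt(b * c_first)) ** 2 / c0


def cost_ratio_threshold(v0, a, b):
    """Upper bound on c_first / c1 below which two-phase sampling is profitable."""
    return ((math.sqrt(v0) - math.sqrt(a)) / math.sqrt(b)) ** 2


# Illustrative values: v0 = 1, a = 0.4, b = 0.4.
threshold = cost_ratio_threshold(1.0, 0.4, 0.4)
```

With these illustrative values the threshold is roughly 0.34, so a first-phase unit cost of 0.1·C1 favours two-phase sampling while 0.5·C1 does not.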


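It follows from (5.7) and (6.13) that

$$\mathrm{Opt.}\left[\min.\mathrm{Var}\left(\hat M_Y^{(F)}\right)\right] < \mathrm{Opt.}\left[\min.\mathrm{Var}\left(\hat M_Y^{(H)}\right)\right]$$

if

$$\sqrt{(V_0 - V_1 - V_3)C_1} + \sqrt{(V_1 - V_2 + V_3)(C_2 + C_3)} < \sqrt{(V_0 - V_1)C_1} + \sqrt{V_1 C_2}$$

for large N.

Further we note from (5.11) and (6.13) that

$$\mathrm{Opt.}\left[\min.\mathrm{Var}\left(\hat M_Y^{(F)}\right)\right] < \mathrm{Opt.}\left[\min.\mathrm{Var}\left(\hat M_Y^{(g)}\right) \text{ or } \min.\mathrm{Var}\left(\hat M_Y^{(G)}\right)\right]$$

if

$$\frac{C_2 + C_3}{C_1} < \left[\frac{\sqrt{V_0 - V_1} - \sqrt{V_0 - V_1 - V_3}}{\sqrt{V_1 - V_2 + V_3} - \sqrt{V_1 - V_2}}\right]^2 \tag{6.16}$$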